[00:03.370 --> 00:07.550]  I'm sure many of you have chip cards in your pocket which you use for credit and debit card
[00:07.550 --> 00:13.570]  transactions. These use the EMV standard for communication between the card and the terminal
[00:13.570 --> 00:20.070]  while performing one of these transactions. EMV is named after the original creators of the
[00:20.070 --> 00:25.450]  standard, Europay, Mastercard and Visa, but since then maintenance of the standard has been taken
[00:25.450 --> 00:32.570]  over by EMVCo, who also manage closely related specifications on contactless payments and
[00:32.570 --> 00:38.270]  some other payment standards like the 3D Secure online payment standard.
[00:38.870 --> 00:44.430]  I've been looking into EMV for about 15 years now. I've found a variety of vulnerabilities
[00:45.050 --> 00:50.550]  both in the standard and in the implementations, but in this talk I'm not going to be covering
[00:50.550 --> 00:55.290]  these vulnerabilities. If you're interested I've got other talks and papers about these.
[00:55.370 --> 01:01.490]  Instead in this talk I'll be looking at some of the low-level details about how EMV works. I hope
[01:01.490 --> 01:07.810]  this might trigger your interest in looking into EMV in more detail yourself and also perhaps will
[01:07.810 --> 01:14.930]  be useful as part of your work. Given the billions of pounds and dollars that flow through EMV
[01:14.930 --> 01:20.890]  transactions you'd think that EMV was designed to be a security specification, but it's not really.
[01:20.930 --> 01:27.770]  Primarily EMV is designed to be a compatibility specification. It's there to allow cards and
[01:27.770 --> 01:34.350]  terminals to communicate regardless of how old or new the equipment is, regardless of where in the
[01:34.350 --> 01:40.030]  world these are and regardless of the banks that have been used to set up the terminal and the
[01:40.030 --> 01:47.970]  cards. This is because when people perform EMV transactions they want to buy something. They are
[01:47.970 --> 01:54.410]  not primarily interested in security and if something went wrong they would rather the
[01:54.410 --> 02:00.990]  transaction went through than not. And so most of the EMV specification is about ensuring compatibility
[02:01.470 --> 02:07.570]  while also managing the constraints. For example, cards are very limited in the amount of RAM they
[02:07.570 --> 02:12.210]  have and the amount of processing they have and this has a lot of impact in how the specification
[02:12.210 --> 02:20.850]  is designed. Another consequence of this desire for compatibility is that cards and terminals try
[02:20.850 --> 02:27.130]  to follow Postel's law. So they are conservative in what they generate but liberal in what they
[02:27.130 --> 02:33.430]  accept. The idea is that if there are some changes to the way that cards and terminals work
[02:33.430 --> 02:38.690]  they'll continue to operate because they will not just reject a transaction because something
[02:38.690 --> 02:45.730]  is unexpected. This is very desirable for compatibility but has the consequence of
[02:45.730 --> 02:52.270]  creating security risks. Firstly, if there is data which is not possible to interpret
[02:52.850 --> 02:58.410]  devices will ignore it but maybe that is very security relevant information that they're
[02:58.410 --> 03:05.910]  ignoring. And secondly, having this backwards and forwards compatibility creates more complexity
[03:05.910 --> 03:12.610]  and from complexity we can get security problems. One way the EMV standard tries to support
[03:12.610 --> 03:22.730]  compatibility is the TLV standard for encoding data. In this standard data is encoded as a tag
[03:22.730 --> 03:31.310]  followed by a length followed by a value. This is a very efficient way of representing hierarchical
[03:31.310 --> 03:37.890]  data structures. If this was a more modern standard you might see things like JSON or XML.
[03:37.890 --> 03:47.010]  In the case of JSON and XML they take up a lot of extra space to encode data. In contrast TLV is
[03:47.010 --> 03:54.590]  much much more efficient. The downside of TLV is that while with JSON or XML you can look at data
[03:54.590 --> 04:00.270]  and more or less work out what's going on, in the case of TLV data you have to look quite carefully
[04:00.270 --> 04:08.190]  before you can work out what this actually means. TLV data helps compatibility because it's possible
[04:08.190 --> 04:15.230]  to decode without the decoder understanding all the semantics of the data that's encoded inside.
[04:15.230 --> 04:22.810]  A decoder can still produce this tree and identify data items that it knows about but it can ignore
[04:22.810 --> 04:29.690]  data items that it doesn't know about. It also has some other features like allowing data items to be
[04:29.690 --> 04:37.370]  deleted just by filling with zeros rather than having to rewrite the whole string of bytes.
[04:37.370 --> 04:44.630]  This is convenient for older memory technologies that allowed erasure but not moving around the data.
[04:46.110 --> 04:55.590]  The TLV standard is also known as ASN.1 BER. So this is from the X.208 standard and it's the
[04:55.590 --> 05:02.690]  standard that is also used as the encoding format for HTTPS certificates. There's been plenty of
[05:02.690 --> 05:11.070]  security vulnerabilities as a result in the HTTPS standards. In this talk I'm going to show how you
[05:11.070 --> 05:18.350]  can manually decode TLV data. Now of course there's decoders out there. I've written some and I'll
[05:18.350 --> 05:24.770]  talk about those at the end of the talk. But there's sometimes cases where it's helpful to be
[05:24.770 --> 05:29.990]  able to decode these things manually. So maybe the data you've got is incomplete or corrupt but you
[05:29.990 --> 05:36.410]  still need to make sense of it. Maybe you have to explain exactly why a particular decoding is the
[05:36.410 --> 05:41.430]  right one. One of the things I do from time to time is expert witness in a court case and there
[05:41.430 --> 05:46.710]  you not only have to give the answer but also explain the rationale of why you're making the
[05:46.710 --> 05:53.010]  recommendation that you are making. You might have to write your own decoder although I wouldn't
[05:53.010 --> 05:58.570]  recommend it. There's lots of places you can go wrong. And also sometimes doing things yourself
[05:59.090 --> 06:04.570]  shows you where other people might have slipped up and help you identify where else security
[06:04.570 --> 06:12.110]  vulnerabilities might be. So in this talk I'm going to show these decoding processes and if you want
[06:12.110 --> 06:18.050]  to follow along then these links here will take you to the Jupyter notebook which I've used for
[06:18.050 --> 06:25.690]  doing the decoding. A document with my notes to show some of the resources that are necessary to
[06:25.690 --> 06:33.370]  understand TLV data in EMV. And there's also the repository which contains the Python notebook
[06:33.370 --> 06:43.400]  and the source code for the utilities that I've been using. In this talk I'll be decoding some
[06:43.400 --> 06:51.860]  TLV data structures. These are represented in hexadecimal. So to help us decode these I've
[06:51.860 --> 06:58.560]  written some simple Python functions that allow us to manipulate these strings of hexadecimal
[06:58.560 --> 07:04.620]  characters. These are part of the hexutils package. You can download it from the repository
[07:04.620 --> 07:09.080]  that I linked to earlier and you can also use this within the Jupyter notebook.
[07:09.980 --> 07:17.960]  The first function takes a byte represented as a pair of characters, hexadecimal characters,
[07:17.960 --> 07:24.960]  and then converts it to binary. It always converts it to eight bits so it'll handle adding leading
[07:24.960 --> 07:33.660]  zeros if necessary. Sometimes these strings will have spaces in them. That's very nice for
[07:33.660 --> 07:39.680]  formatting but can complicate decoding. So the function stripBytes removes any whitespace
[07:40.380 --> 07:49.490]  before, after, and within the hexadecimal string. This one adds whitespace back in to make it easier
[07:49.490 --> 08:00.060]  to look at. It splits the hex string into bytes and then adds a space between each byte. Sometimes
[08:00.060 --> 08:05.460]  we want to count how long a hex string is. So this counts how many bytes are in a hex string,
[08:05.460 --> 08:15.290]  removing any whitespace beforehand. Sometimes the hex bytes include text.
[08:15.290 --> 08:21.930]  Typically this would be encoded using ASCII. So this function decodes a hex string
[08:22.440 --> 08:32.450]  into the equivalent string using ISO 8859-1 which is a superset of ASCII.
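The helpers described so far are small enough to sketch directly. The following is a minimal illustration of what such functions could look like; the names `strip_hex`, `byte_to_bits`, `space_hex`, `count_bytes` and `hex_to_text` are made up for this sketch and are not necessarily the names or signatures used in the speaker's hexutils package.

```python
def strip_hex(hex_string: str) -> str:
    """Remove whitespace before, after and within a hex string."""
    return "".join(hex_string.split())


def byte_to_bits(hex_byte: str) -> str:
    """Convert one byte (two hex characters) to eight binary digits,
    keeping any leading zeros."""
    return format(int(strip_hex(hex_byte), 16), "08b")


def space_hex(hex_string: str) -> str:
    """Split a hex string into bytes and put a space between each byte."""
    s = strip_hex(hex_string)
    return " ".join(s[i:i + 2] for i in range(0, len(s), 2))


def count_bytes(hex_string: str) -> int:
    """Count the bytes in a hex string, ignoring whitespace."""
    return len(strip_hex(hex_string)) // 2


def hex_to_text(hex_string: str) -> str:
    """Decode hex bytes as ISO 8859-1 text (a superset of ASCII)."""
    return bytes.fromhex(strip_hex(hex_string)).decode("iso-8859-1")


print(byte_to_bits("0c"))        # 00001100
print(space_hex("708194"))       # 70 81 94
print(count_bytes(" 70 81 94"))  # 3
print(hex_to_text("45 4D 56"))   # EMV
```

ISO 8859-1 is a convenient choice here because every byte value decodes to some character, so the function never fails on unexpected input.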
[08:35.790 --> 08:41.270]  Now into some slightly more interesting functions. Quite often we'll need to look at the
[08:41.270 --> 08:49.410]  individual bits and at what position bits are set to 1 or 0. So this function takes in a byte
[08:49.410 --> 08:57.570]  in hexadecimal and then shows you which bits are set to 1 and which bits are set to 0.
[08:59.940 --> 09:05.220]  Sometimes there actually might be multiple fields within a single byte. So this function
[09:05.220 --> 09:12.880]  takes a specification of how long each field is, converts the hex into binary, and then shows
[09:12.880 --> 09:17.600]  each of these binary digits according to which field it belongs to.
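Both bit-level helpers can be sketched in a few lines. The names `show_bits` and `split_fields` are hypothetical; the sketch assumes the usual smart card convention that bit 8 is the most significant bit and bit 1 the least significant.

```python
def show_bits(hex_byte: str) -> dict:
    """Map each bit position (8 = most significant, 1 = least significant)
    to its value, 0 or 1."""
    value = int(hex_byte, 16)
    return {bit: (value >> (bit - 1)) & 1 for bit in range(8, 0, -1)}


def split_fields(hex_byte: str, widths: list) -> list:
    """Split one byte into consecutive bit fields, most significant first.
    `widths` gives the length of each field and must add up to 8."""
    if sum(widths) != 8:
        raise ValueError("field widths must add up to 8 bits")
    bits = format(int(hex_byte, 16), "08b")
    fields = []
    position = 0
    for width in widths:
        fields.append(bits[position:position + width])
        position += width
    return fields


print(show_bits("81"))             # bits 8 and 1 are set, the rest are 0
print(split_fields("14", [5, 3]))  # ['00010', '100']
```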
[09:21.650 --> 09:29.550]  I'm going to be taking sections out of hex strings. So the take function does this.
[09:29.770 --> 09:34.770]  In the first format it just takes a certain number of bytes. So in this case you have a
[09:34.770 --> 09:39.350]  hex string and then you ask for two bytes and it gives you these two bytes.
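A function like this is a thin wrapper around string slicing, since each byte is two hex characters. The sketch below covers both the plain form and the form with an offset; the argument order is an assumption for illustration and may differ from the speaker's own `take`.

```python
def take(hex_string: str, count: int, offset: int = 0) -> str:
    """Return `count` bytes from a hex string, skipping `offset` bytes
    first (offsets count from zero). Whitespace is ignored."""
    s = "".join(hex_string.split())
    return s[2 * offset:2 * (offset + count)]


print(take("70 81 94 8c", 2))     # 7081
print(take("70 81 94 8c", 2, 1))  # 8194
```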
[09:40.590 --> 09:46.650]  And in the second form it takes two bytes but also takes an offset. So here I'm starting
[09:46.650 --> 09:51.310]  at offset 1, counting from zero. That means I'm skipping the first byte and then taking the
[09:51.310 --> 10:03.880]  next two bytes. In this talk I'm going to show how to decode some real EMV data and I took some
[10:03.880 --> 10:09.140]  data off my own credit card. Now don't get too excited, the card's long since expired
[10:09.140 --> 10:14.560]  so you won't be seeing any sensitive information. But this is a real card.
[10:14.920 --> 10:20.700]  To actually extract this data I used a tool called CardPeek. This is really handy. It can
[10:20.700 --> 10:26.540]  deal with many different types of smart cards. It can do TLV decoding and other types of decoding by
[10:26.540 --> 10:32.040]  itself. But the whole point of this talk is to show you how to do this yourself. So rather than
[10:32.040 --> 10:36.880]  looking at the final output of CardPeek I'm going to look at the log file from this.
[10:37.880 --> 10:44.600]  One of the stages of getting data off a card is to use the read record command.
[10:44.600 --> 10:49.600]  To see how this works let's look at the EMV specification. Rather than giving the whole
[10:49.600 --> 10:55.600]  spec I've taken some selected parts and these are available in the notes.
[11:00.740 --> 11:05.920]  So here is the specification for the read record command.
[11:06.320 --> 11:13.060]  EMV commands take two different bytes to specify the command. There's the class and the instruction.
[11:13.060 --> 11:22.100]  So for read record class is 00, instruction is B2. And then it takes two parameters P1 and P2.
[11:22.160 --> 11:28.340]  These parameters depend on the actual command. So if we look up the specification for read record
[11:28.340 --> 11:32.100]  we'll see that P1 is the record number we're going to read,
[11:32.100 --> 11:37.440]  and then P2 is a reference control parameter. And I'll talk about that a bit later.
[11:38.880 --> 11:46.260]  We need to specify how long the record is. We need to specify the length expected,
[11:46.260 --> 11:51.920]  but initially we don't know what this is. So the way to deal with this is you initially specify
[11:51.920 --> 11:58.560]  0 as the length and then the card will return an error message. So 6C is an error status,
[11:58.560 --> 12:05.080]  and it tells you how long the response actually is. And then you call the same command again, now
[12:05.080 --> 12:10.440]  specifying the correct length, and it will return 9000, which means everything is okay,
[12:10.440 --> 12:15.260]  along with the hex 97 bytes that you've requested.
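As a rough sketch of that exchange, the command bytes can be assembled as below. The function names are hypothetical, and the sketch assumes the status arrives as two bytes where the second byte of a 6C status carries the correct length. The P2 layout used here (five bits of SFI followed by the bits 100) is the reference control parameter format that the talk describes for this command.

```python
def read_record_apdu(record: int, sfi: int, expected_length: int = 0) -> bytes:
    """Build a READ RECORD command: class 00, instruction B2,
    P1 = record number, P2 = reference control parameter, then the
    expected response length (0 when it is not yet known)."""
    if not 0 <= sfi < 32:
        raise ValueError("the SFI has to fit in five bits")
    p2 = (sfi << 3) | 0b100  # SFI in the top five bits, then 100
    return bytes([0x00, 0xB2, record, p2, expected_length])


def corrected_length(status: bytes):
    """If the status is 6C xx, return xx (the length to ask for on the
    retry); otherwise return None."""
    sw1, sw2 = status
    return sw2 if sw1 == 0x6C else None


first_try = read_record_apdu(record=2, sfi=2)
print(first_try.hex())  # 00b2021400

length = corrected_length(bytes.fromhex("6c97"))
retry = read_record_apdu(record=2, sfi=2, expected_length=length)
print(retry.hex())      # 00b2021497
```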
[12:17.360 --> 12:20.740]  So I mentioned that there's a reference control parameter.
[12:20.740 --> 12:28.920]  This is our first example of one of these fields where you are taking a single byte and splitting it
[12:28.920 --> 12:38.120]  up. So here is the table that shows how one of these bytes is formatted. The first five bits
[12:38.660 --> 12:45.960]  are the SFI, the short file identifier. We found out the short file identifier earlier on when we
[12:45.960 --> 12:51.640]  activated the payment application on this card and we found out that the relevant records are
[12:51.640 --> 13:00.200]  in SFI 2. So the first five bits get set to 2 and then the last three bits get set to 100
[13:00.200 --> 13:08.280]  to indicate that P1 is a record number. So we can show how this works by calling the formatBytes
[13:08.280 --> 13:16.480]  function with field settings of 5 followed by 3, and we see that indeed the first field is set to
[13:16.480 --> 13:30.080]  2 and the second field is set to 100. So I mentioned that this is a 97 byte long string
[13:30.820 --> 13:36.960]  that's in hexadecimal. If we convert that to decimal we get 151. So the response for record
[13:36.960 --> 13:47.900]  number 2 is 151 bytes. We're going to now try decoding this and the first step is by putting
[13:47.900 --> 13:52.960]  this into a Python variable. I'm going to call this response and we'll see this used quite a lot
[13:52.960 --> 14:05.060]  in the rest of the talk. So let's look at the first byte of this response. We know that this is a TLV
[14:06.320 --> 14:13.700]  string. First byte of a TLV string is the tag. So let's look at what that actually is.
[14:13.700 --> 14:22.540]  And if we look at that we will see that it is 70. So now let's go and try to decode this tag 70.
[14:24.760 --> 14:33.980]  Tags can be multiple bytes long. So here is the table for decoding the first byte of a tag.
[14:35.380 --> 14:41.860]  We can see that there are three different fields. One that's two bits long, one that's one bit long
[14:41.860 --> 14:54.240]  and then the rest is five bits long. If we now decode this hex 70 following the field specification
[14:54.920 --> 15:04.020]  we see that the first field is 01 and that corresponds to application class. Tags can be
[15:04.020 --> 15:11.160]  universal which means that every application that handles ASN.1 should be able to understand it
[15:11.780 --> 15:19.940]  and it's not one of those. There's application class, where the tag is specific to a particular application
[15:19.940 --> 15:27.980]  so it could be smart card for example but it has the same meaning regardless of where it's used.
[15:27.980 --> 15:35.120]  It can be context specific which means that the meaning actually depends where this tag is present
[15:35.120 --> 15:39.780]  or it can be private in which case the specification doesn't deal with it. It's
[15:39.780 --> 15:46.460]  up to the producer of the card to define what this actually means. Anyway this is 01
[15:46.460 --> 15:53.640]  which is application class which means it's specific to smart cards. Now we need to find
[15:53.640 --> 16:01.350]  out what is the contents of this tag. This could be either primitive or it could be constructed.
[16:01.940 --> 16:08.220]  Primitive means that no further decoding is handled as part of TLV and constructed means that
[16:08.220 --> 16:15.560]  the contents is a series of TLV data items. Here we can see this field is set to one which means
[16:15.560 --> 16:23.320]  that it is a constructed data object, so the contents are going to be more TLV data items.
[16:24.720 --> 16:31.080]  And then the last five bits is just a number that encodes which particular tag that is.
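The three fields of the first tag byte can be pulled apart with shifts and masks. This is a minimal sketch with a hypothetical function name; it also reports whether the low five bits are all ones, which signals that the tag continues into further bytes.

```python
TAG_CLASSES = {
    0b00: "universal",
    0b01: "application",
    0b10: "context-specific",
    0b11: "private",
}


def decode_tag_first_byte(byte: int) -> dict:
    """Split the first byte of a BER tag into its three fields."""
    number = byte & 0x1F                  # low five bits
    return {
        "class": TAG_CLASSES[byte >> 6],  # top two bits
        "constructed": bool(byte & 0x20), # bit 6: contents are more TLV
        "number": number,
        "more_bytes": number == 0x1F,     # all ones: multi-byte tag
    }


print(decode_tag_first_byte(0x70))
# {'class': 'application', 'constructed': True, 'number': 16, 'more_bytes': False}
```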
[16:31.140 --> 16:42.200]  This is 1 0 0 0 0. As long as this is not all ones this is a one byte tag. So what we know is that
[16:42.200 --> 16:52.040]  tag 70 is a one byte tag containing TLV data items. And if we look this up in the EMV specification
[16:52.040 --> 16:57.760]  we'd find out this is a read record response message template which is what we'd expect
[16:57.760 --> 17:07.250]  because we just called read record. Okay so now continuing on we've got the tag. Tag is 70.
[17:07.250 --> 17:16.430]  The next bytes are going to be the length. So let's look at the length and this is hex 81.
[17:16.810 --> 17:20.610]  You actually have to decode the length to find out what it means. So let's look at the table
[17:20.610 --> 17:31.490]  for decoding length. And what this says is that if bit 8 is not set then this is the length.
[17:31.490 --> 17:40.410]  But if bit 8 is set then this specifies the length of the length and that is going to follow.
[17:40.410 --> 17:47.210]  So here we can see that bit 8 is set and the rest of it is 1 which means that there's going to be
[17:47.210 --> 17:56.900]  an additional one byte that contains the actual length of this data item. So let's look at the
[17:56.900 --> 18:05.740]  next byte and this is hex 94 and that is the actual length of the contents of this response.
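The length rule can be sketched as follows. The function name is hypothetical, and the cap of four length bytes is a defensive choice made for this sketch rather than something taken from the specification; the indefinite form (a first byte of exactly hex 80) is simply rejected.

```python
def decode_length(data: bytes, offset: int = 0):
    """Decode a BER length starting at `offset`.
    Returns (length, number of bytes the length field itself occupies)."""
    first = data[offset]
    if first & 0x80 == 0:
        return first, 1              # bit 8 clear: this byte is the length
    count = first & 0x7F             # bit 8 set: how many length bytes follow
    if count == 0:
        raise ValueError("indefinite length is not handled in this sketch")
    if count > 4:
        raise ValueError("refusing an implausibly long length field")
    if offset + 1 + count > len(data):
        raise ValueError("length field runs past the end of the data")
    value = int.from_bytes(data[offset + 1:offset + 1 + count], "big")
    return value, 1 + count


print(decode_length(bytes.fromhex("8194")))  # (148, 2)
print(decode_length(bytes.fromhex("21")))    # (33, 1)
```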
[18:05.940 --> 18:18.080]  Converting that into decimal gives 148 bytes. So within tag 70 there are 148 bytes of further data
[18:18.080 --> 18:27.460]  and these are in the TLV format. Let's actually check whether this makes sense.
[18:27.620 --> 18:36.600]  So let's take these 148 bytes. We get to this by skipping the one byte tag and then the two
[18:36.600 --> 18:44.660]  bytes used for the length. And if we check the size of the whole response and take off those
[18:44.660 --> 18:52.100]  three bytes that we skipped over then indeed there's 148 bytes there. So this tag 70 contains
[18:52.100 --> 19:04.180]  everything that is part of the read record response. We know that this is a constructed tag
[19:04.180 --> 19:12.020]  so the contents are going to be a TLV item or one or more TLV items. So let's look at the
[19:12.020 --> 19:18.820]  first part of that. So the first byte is 8c and to find out what that actually
[19:18.820 --> 19:23.920]  means as a tag we need to decode it. So let's follow the same decoding pattern.
[19:28.010 --> 19:35.590]  So we see the first field is context specific so the meaning of this depends where it's actually
[19:35.590 --> 19:41.370]  found. This is as part of an EMV application so we have to use the EMV specification.
[19:42.170 --> 19:50.190]  The next bit says that it is a primitive data object so the ASN.1 specification
[19:50.850 --> 19:54.330]  doesn't say anything more about how to decode that. We need to look elsewhere.
[19:54.790 --> 20:03.510]  And then the rest of it is not all ones so this is a one byte tag and the bits that are in this
[20:04.170 --> 20:11.190]  final field show us which of the tags it is. So to find the meaning of this we're going to need to
[20:11.370 --> 20:17.590]  look again into the EMV specification. It has something called a tag dictionary where you can
[20:17.590 --> 20:27.150]  see a list of tags and their meaning and their format. So this is 8c and then we can see that
[20:27.150 --> 20:35.490]  8c is the Card Risk Management Data Object List 1 (CDOL1). I'll go into what data object lists are later. That's an
[20:36.050 --> 20:48.190]  important data format within EMV. But let's look at the contents next and to do that we need to
[20:48.190 --> 20:56.410]  see the length. The length is 21 in hex. Now is this a one byte length or a multi-byte length?
[20:56.410 --> 21:04.950]  We need to decode it. Bit 8 is zero which means that this is a one byte length. So the size of
[21:04.950 --> 21:14.490]  this CDOL1 record is hex 21 bytes, and if we convert that to decimal we get 33. So there are going to be
[21:14.490 --> 21:27.050]  33 bytes following tag 8c, which is the CDOL1. So let's get that. We've got to skip over a few
[21:27.050 --> 21:34.050]  things. We've got to skip over tag 70, the two byte length, skip over tag 8c and then one byte
[21:34.050 --> 21:43.580]  length and take 33 bytes and then this is what we actually get. So we'll save the CDOL1
[21:43.580 --> 21:53.420]  for later but now let's continue and see what is the next item within this response. So we're going
[21:53.420 --> 22:01.100]  to skip over everything we did before and also skip over the 33 bytes of data which is the CDOL1
[22:01.100 --> 22:11.880]  and then the next thing we get is another tag. So this is 8d and then this is a one byte tag
[22:11.880 --> 22:19.700]  which is the CDOL2. Then we look at the length. This is hex 0C; that's a one byte length
[22:20.800 --> 22:26.840]  and then actually now let's get the contents of this tag. So this is
[22:28.300 --> 22:35.380]  hex 0C bytes starting at the offset we get by skipping over everything before, and then this is now the
[22:35.380 --> 22:46.170]  CDOL2. So we've got both the CDOL1 and the CDOL2 which followed it. So I mentioned that CDOL1
[22:46.170 --> 22:54.690]  is a DOL, a data object list. DOLs are really quite important for EMV. They come about because
[22:55.330 --> 23:01.690]  EMV assumes that cards are very limited in what they can do. Maybe they cannot even decode TLV
[23:01.690 --> 23:08.290]  data themselves. The terminal on the other hand is more powerful so it can do a little bit more
[23:08.290 --> 23:15.190]  than the card. So when data is sent from the card to the terminal that's often in TLV format.
[23:15.570 --> 23:22.950]  But when data is sent from the terminal to the card generally TLV is not used because maybe the
[23:23.510 --> 23:33.190]  card doesn't understand how to decode that sort of data item. Instead we use DOLs. So a data object
[23:33.190 --> 23:41.010]  list is a set of tags and lengths and that tells the terminal how to format data that the card is
[23:41.010 --> 23:48.110]  going to receive. And then when the card receives this all it needs to do is jump to specific
[23:48.110 --> 23:55.490]  offsets within this set of data to know exactly what is the data item at that particular position
[23:55.490 --> 24:02.350]  without having to go through any more decoding steps. So one of these DOLs consists of a list
[24:02.350 --> 24:07.050]  of tags and a list of lengths. And that tells the terminal when it's sending data to the card
[24:07.050 --> 24:12.810]  in this particular context all it should do is take this particular data item that the card
[24:12.810 --> 24:20.150]  requests, pad it to a particular size, and then send it to the card. And then keep on sending
[24:20.150 --> 24:27.970]  these data items of the particular length until all the tags covered in the DOL have been sent
[24:27.970 --> 24:36.730]  over to the card. So the CDOL1 is used as the first step of the authorization process where the
[24:36.730 --> 24:45.010]  card is asked to produce an authorization request. If the card agrees to authorize that transaction
[24:45.010 --> 24:51.290]  it gets sent to the bank normally. And then the second stage of actually telling the card that the
[24:51.290 --> 24:57.970]  transaction has succeeded is what the CDOL2 is used for. The data items that are requested
[24:58.490 --> 25:03.410]  in each of these requests are a little bit different. That's why there's a CDOL1
[25:03.410 --> 25:12.090]  and a separate CDOL2. So now let's go actually into this CDOL1. It's a list of tags so let's
[25:12.090 --> 25:18.890]  look at the first byte of the CDOL1 and we get 9f. To find out what this actually means we will
[25:18.890 --> 25:27.790]  need to look at the table for decoding the first bytes of tags. Here we are and now we can see
[25:27.790 --> 25:37.570]  that the first field is 1 0 so it's context specific. This is an EMV tag. The next bit says
[25:38.090 --> 25:45.850]  that it is a primitive data object so it's not TLV itself. And then the rest of the bits are
[25:45.850 --> 25:53.050]  1 1 1 1 1. If you haven't seen this before this means 9f is the beginning of a multi-byte tag.
[25:53.050 --> 25:59.970]  It's not a one byte tag like we've seen before. So to find out what else is going on in this tag
[25:59.970 --> 26:05.430]  we need to look at the next byte and we get this as 02. To interpret this we need a different
[26:05.430 --> 26:14.970]  table so let's go on to the second table. This says that if the first bit is zero then that is
[26:14.970 --> 26:20.930]  the last byte of the tag. If it's one there's going to be more coming. So here we see that the
[26:20.930 --> 26:29.610]  first bit is zero so this is a two byte tag. One and two byte tags are quite common. There's also
[26:29.610 --> 26:34.630]  some three byte tags particularly with the contactless specifications. In principle you
[26:34.630 --> 26:40.930]  can have tags as long as you might like but I've never seen one that is longer than three bytes.
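Putting the multi-byte tag rule together with the shape of a data object list (a run of tags, each followed by a length) gives a short parser. This is a sketch with hypothetical function names, assuming every length in the list fits in a single byte. The example bytes are an illustrative list, not the CDOL1 read from the card in the talk.

```python
def read_tag(data: bytes, offset: int = 0):
    """Read one tag starting at `offset`.
    Returns (tag as upper-case hex, number of bytes the tag occupies)."""
    start = offset
    first = data[offset]
    offset += 1
    if first & 0x1F == 0x1F:           # low five bits all ones: more bytes follow
        while True:
            following = data[offset]
            offset += 1
            if following & 0x80 == 0:  # top bit clear: last byte of the tag
                break
    return data[start:offset].hex().upper(), offset - start


def parse_dol(dol: bytes) -> list:
    """Parse a data object list into (tag, length) pairs.
    Assumes each length is a single byte."""
    entries = []
    position = 0
    while position < len(dol):
        tag, used = read_tag(dol, position)
        position += used
        length = dol[position]
        position += 1
        entries.append((tag, length))
    return entries


example = bytes.fromhex("9F02069F03069F1A029505")  # made-up example list
print(parse_dol(example))
# [('9F02', 6), ('9F03', 6), ('9F1A', 2), ('95', 5)]
```

Truncated input raises an `IndexError` here; a decoder meant for untrusted data would want to check bounds explicitly and cap the tag length.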
[26:41.650 --> 26:49.710]  After the tag is the length and this is going to be six. So if we now look at the
[26:49.710 --> 26:56.970]  EMV specification which shows us what the meaning of each tag is then we can see that
[26:59.210 --> 27:06.990]  9f02 is the amount authorized in numeric format. So the card wants to know when it's authorizing
[27:07.190 --> 27:13.530]  a transaction what is the amount it's going to authorize. So it's quite understandable and
[27:13.530 --> 27:19.070]  the card knows that this is the first field that's going to be sent. It knows that it's exactly six
[27:19.070 --> 27:25.570]  bytes long and it can then very easily compare this value to any limits that are specified
[27:26.110 --> 27:34.810]  in the card as to whether to authorize a transaction or not. Next let's go back to the
[27:34.810 --> 27:45.530]  response. At position 78 there is another tag this is two bytes long and it is 5f20. If we go
[27:45.530 --> 27:53.590]  and look what this actually means in the EMV specification we can see that this is the cardholder
[27:53.590 --> 28:03.630]  name and it is in alphanumeric format. So after the tag is going to be the length
[28:04.430 --> 28:18.310]  and this is 13 in hex which is 19 in decimal. If we now take these 19 bytes we get this and
[28:18.310 --> 28:26.090]  we can then decode it as ASCII and we get the cardholder name. The cardholder name is also present on
[28:26.090 --> 28:32.070]  the magnetic stripe and I'll talk about the relationship between the chip data and the
[28:32.070 --> 28:41.730]  magnetic stripe data a bit later. Okay let's keep on going. Next at offset 100 we have another tag
[28:41.730 --> 28:50.930]  this is 5f30 and this is the service code. The service code is another part of the EMV chip data
[28:50.930 --> 28:57.510]  which was originally from the magnetic stripe standard. Let's go and have a look at what this
[28:57.510 --> 29:06.610]  says. After the tag is the length this is two and after the length is the two bytes that make up the
[29:06.610 --> 29:12.830]  content. From the specification we know that this is binary coded decimal. We know it is three digits
[29:12.830 --> 29:20.090]  long and with binary coded decimal it's left padded with zero so we ignore the first zero and we get
[29:20.090 --> 29:28.110]  201. Let's look at what this actually means. The first digit is two and this says that this is
[29:28.110 --> 29:36.210]  suitable for international transactions and it has a chip. This service code on the magnetic stripe
[29:36.210 --> 29:43.230]  is how terminals know that this card should have a chip if you swipe the magnetic stripe. That's why
[29:43.230 --> 29:50.290]  if you try that the terminal will say please use chip. Next digit is zero which says that it can
[29:50.290 --> 29:56.750]  be authorized in normal ways and the final digit is one which says that it can be used with or
[29:56.750 --> 30:07.070]  without PIN and it can also be used for any type of transaction. Next let's look at offset 57 where
[30:07.070 --> 30:15.310]  we've got actually tag 57 by coincidence. This is the track 2 equivalent data. This is a copy
[30:15.310 --> 30:23.250]  of the track 2 of the magnetic stripe on the back of the card. It's there to allow the terminal to
[30:23.250 --> 30:29.650]  process a chip transaction as if it is mag stripe. Maybe it's because the network isn't able to
[30:29.650 --> 30:36.270]  process chip transaction or maybe the issuer has not yet upgraded their systems. So the terminal
[30:36.270 --> 30:42.770]  has all the information necessary to put together a copy of track 1 and track 2 of the magnetic
[30:42.770 --> 30:49.290]  stripe. For track 1 the terminal needs to take data items like the account number and the cardholder
[30:49.290 --> 30:55.290]  name and so on and then put that together. For track 2 there's actually a complete copy of track
[30:55.290 --> 31:00.130]  2 in one of these data fields and that's what you're seeing here. So we can look what this
[31:00.130 --> 31:09.330]  looks like. The start sentinel was missed out but then the next is the account number 16 binary
[31:09.330 --> 31:15.650]  coded decimal digits. Here I've masked out the middle of the digits because I've had too many
[31:15.650 --> 31:19.870]  conversations with my bank to say I needed a new card because my card number has been shown on
[31:19.870 --> 31:25.510]  television even though this card actually was cancelled quite a while ago. Then the separator
[31:25.510 --> 31:31.010]  normally this would be an equals sign on track 2 but you can't encode an equals sign in binary coded decimal so they
[31:31.010 --> 31:41.670]  use d. Next is the expiry date so this is December 2015 so 1512. Then there's a service code which
[31:41.670 --> 31:47.910]  we've seen before 201. And finally there's the discretionary data. This varies between the
[31:47.910 --> 31:54.170]  issuer but most of them follow a similar process so we can guess what it actually means. The first
[31:55.050 --> 32:02.670]  five digits are the PIN verification value. This is used for verifying whether a PIN is correct
[32:02.670 --> 32:09.830]  without the card actually talking to the bank which knows the PIN. This made sense when a lot
[32:09.830 --> 32:14.690]  of ATMs were not connected to any network. Nowadays it's probably not used
[32:14.690 --> 32:22.430]  but it's still there. The first digit is one, which is the index for the key that is used for
[32:22.430 --> 32:29.550]  encrypting the PVV and then the rest which is 4079 is the PVV itself.
[32:30.210 --> 32:39.470]  After the PVV is the CVV, the card verification value, and this is 927. This was a security feature
[32:39.470 --> 32:46.730]  introduced on magnetic stripe cards to help the bank detect whether a card is fake or not.
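The track 2 layout walked through here can be sketched as a simple split on the D separator. The function names and the account number below are made up for illustration; the expiry date, service code, PVV and CVV values are the ones read out in the talk. The split of the discretionary data follows the guess described in the talk, since its layout varies between issuers and a real card may carry further digits.

```python
def parse_track2(hex_string: str) -> dict:
    """Split track 2 equivalent data (binary coded decimal, with D as
    the separator) into its fields."""
    digits = "".join(hex_string.split()).upper().rstrip("F")  # drop any F padding
    account_number, separator, rest = digits.partition("D")
    if not separator:
        raise ValueError("no D separator found")
    return {
        "account_number": account_number,
        "expiry_yymm": rest[0:4],
        "service_code": rest[4:7],
        "discretionary": rest[7:],
    }


def split_discretionary(discretionary: str) -> dict:
    """Guess at a common issuer layout: key index, PVV, then CVV."""
    return {
        "pvv_key_index": discretionary[0],
        "pvv": discretionary[1:5],
        "cvv": discretionary[5:8],
    }


# The account number is a placeholder, not a real card.
track2 = parse_track2("4111111111111111D151220114079927")
print(track2["expiry_yymm"], track2["service_code"])  # 1512 201
print(split_discretionary(track2["discretionary"]))
# {'pvv_key_index': '1', 'pvv': '4079', 'cvv': '927'}
```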
[32:47.170 --> 32:51.290]  A type of fraud that was happening quite frequently is people were getting
[32:51.290 --> 32:57.590]  card numbers and expiry dates and creating cloned magnetic stripe cards. The way that this was fixed
[32:57.590 --> 33:05.350]  is the CVV was written onto the magnetic stripe and it allowed the bank to see whether this is
[33:05.810 --> 33:12.230]  a genuine card or whether it was a card created with a copy of the account number and the expiry
[33:12.230 --> 33:19.310]  date that was obtained from a receipt perhaps. Now this actually creates a problem when chip
[33:19.310 --> 33:24.970]  transactions came along because if there was a copy of the CVV on the chip it meant someone who
[33:24.970 --> 33:30.050]  had data from the chip could then create a mag stripe and then completely bypass all the extra
[33:30.050 --> 33:37.750]  security features of the chip transaction. Eventually this got fixed in about 2008,
[33:38.270 --> 33:45.290]  when a different CVV was put onto the chip compared to the copy of the CVV
[33:45.290 --> 33:50.310]  which was on the mag stripe and that meant that in principle card issuers could tell the difference
[33:50.310 --> 33:56.510]  between a genuine mag stripe transaction and a mag stripe transaction that was made using
[33:56.510 --> 34:02.630]  chip data. Now it turned out that that didn't actually work out as well as hoped because what
[34:02.630 --> 34:09.410]  we've found out from some recent research from Leigh-Anne Galloway is that issuers are accepting
[34:09.410 --> 34:17.070]  transactions as mag stripe transactions even though the CVV is incorrect because the CVV
[34:17.070 --> 34:22.910]  was taken from the chip and not actually from the mag stripe. So I've tried to give you an overview
[34:22.910 --> 34:31.270]  of some of the issues involved in decoding TLV. You can see it's not easy to get right. There's
[34:31.270 --> 34:37.230]  different encodings used for different purposes in different contexts and even if you're using
[34:37.230 --> 34:43.030]  the right decoding format then there's lots of ways that things can go wrong. For example we've
[34:43.030 --> 34:50.450]  seen that tags can be of almost indefinite length. It's very easy to encode extremely long lengths
[34:50.450 --> 34:57.950]  because you can say that I'm now going to send eight further bytes of length and then have
[34:58.590 --> 35:04.270]  massive lengths that are then processed by the TLV decoder, and that could cause memory overflow
[35:04.270 --> 35:11.170]  or memory underflow errors. Because of all the ways that things can go wrong, mistakes do happen,
[35:11.170 --> 35:17.730]  and problems come not only from the mistakes themselves but also from the consequences of those
[35:17.730 --> 35:26.350]  mistakes. What I mean is that issuers, the banks sending out cards, know that sometimes things will
[35:26.350 --> 35:31.290]  go wrong because of all these complexities of EMV and they will accept transactions even though
[35:31.290 --> 35:37.670]  something has gone wrong. Now more often than not that's not fraud it's just someone has made a
[35:37.670 --> 35:44.350]  mistake with encoding or decoding some data but sometimes it will be fraud and this complexity
[35:44.350 --> 35:50.950]  leading to mistakes and then leading to being very forgiving about errors actually makes the
[35:50.950 --> 35:59.070]  EMV system more vulnerable to fraud than it would be otherwise. So maybe you'd like to try to do this
[35:59.070 --> 36:05.530]  yourself. It's quite an interesting experience. There might be some specific reason why you might
[36:05.530 --> 36:11.290]  have to do this but it's often very instructive to try to do something yourself even though you
[36:11.290 --> 36:16.770]  can get some code out there already. One of my colleagues Mike Bond said the only way to
[36:16.770 --> 36:22.670]  understand the wheel is to reinvent it and when I've been writing TLV decoders and other aspects
[36:22.670 --> 36:30.150]  of the EMV infrastructure that's helped me understand where I've slipped up and if I slipped
[36:30.150 --> 36:34.730]  up then there's a reasonable chance that someone else might slip up and that's been a very fruitful
[36:34.730 --> 36:42.390]  source of vulnerabilities in terms of my research. But if you don't want to do all the decoding
[36:42.390 --> 36:50.830]  yourself I've written a TLV decoder on the emvlab.org website where you can try this out.
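For anyone who does want to write their own, here is a minimal sketch of reading a single TLV object while guarding against the oversized tags and lengths mentioned earlier. It is not the emvlab.org decoder; the name `read_tlv` and the two limits are assumptions chosen for illustration, not values taken from the EMV specification.

```python
MAX_TAG_BYTES = 4     # assumed policy limit, not taken from the spec
MAX_LENGTH_BYTES = 2  # assumed policy limit, not taken from the spec


def read_tlv(buf: bytes, pos: int = 0) -> tuple[bytes, bytes, int]:
    """Return (tag, value, next_pos) for the TLV object starting at pos."""
    start = pos
    if pos >= len(buf):
        raise ValueError("truncated tag")
    first = buf[pos]
    pos += 1
    if first & 0x1F == 0x1F:  # low five bits all set: more tag bytes follow
        while True:
            if pos >= len(buf):
                raise ValueError("truncated tag")
            if pos - start >= MAX_TAG_BYTES:
                raise ValueError("tag too long")
            more = buf[pos] & 0x80  # top bit set means another tag byte
            pos += 1
            if not more:
                break
    tag = buf[start:pos]

    if pos >= len(buf):
        raise ValueError("truncated length")
    length = buf[pos]
    pos += 1
    if length & 0x80:  # long form: low seven bits count the length bytes
        count = length & 0x7F
        if count == 0 or count > MAX_LENGTH_BYTES:
            raise ValueError("unsupported length encoding")
        if pos + count > len(buf):
            raise ValueError("truncated length")
        length = int.from_bytes(buf[pos:pos + count], "big")
        pos += count

    if length > len(buf) - pos:  # never trust a length beyond the buffer
        raise ValueError("value runs past end of data")
    return tag, buf[pos:pos + length], pos + length


# Service code 201 under tag 5F30 (hand-built example bytes).
tag, value, next_pos = read_tlv(bytes.fromhex("5F30020201"))
print(tag.hex(), value.hex(), next_pos)
```

Rejecting an input outright, rather than trying to make sense of it, is a deliberate choice in this sketch: a length byte of 0x88 announcing eight further length bytes raises an error instead of being processed.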
[36:50.830 --> 36:55.070]  Here is the decoding of the data I've been talking about in this talk.
[36:55.070 --> 37:01.670]  So you can see that CDOL1 and CDOL2 are there first. There's some data I didn't talk
[37:01.670 --> 37:08.450]  about in this talk. The application version number and the currency code, currency code exponent,
[37:08.450 --> 37:15.490]  and the DDOL and the ICC public key. But you can also see the track two equivalent data is there,
[37:15.490 --> 37:22.850]  service code, the track one discretionary data which is necessary for formatting track one if
[37:22.850 --> 37:28.430]  you're going to make a magstripe transaction using the chip data. So that stuff is all there.
[37:28.430 --> 37:32.890]  This is just one of several records that are stored by the card. There's a lot more out there.
[37:32.890 --> 37:37.790]  If you want to try for yourself then get a smart card reader, try out Cardpeek and then see what
[37:37.790 --> 37:44.310]  you can find. So there's more information about me and my research on my website and if you're
[37:44.310 --> 37:49.770]  interested in updates from our research group then have a look at our blog, Bentham's Gaze.
